SUPPLEMENTARY NOTES

This  is  an  annotated,  edited  version  of  a  1982  survey  of  formula  score  theory. 
Sections  two  and  three  provide  an  outline  of  the  basic  theory.  The  remaining 
sections  deal  with  applications. 
Latent trait theory, item response theory, formula score theory, local
independence test, ability distribution estimation, density estimation,
item bias, item drift, multidimensional items, multidimensional ability
distributions, unidimensionality, test security, consistent estimation,
maximum likelihood estimation, regression function.
Formula score theory (FST) associates each multiple-choice test with a linear
operator and expresses all of the real functions of item response theory as
linear combinations of the operator's eigenfunctions. Hard measurement problems
can then often be reformulated as easier, standard mathematical problems.

For example, the problem of estimating ability distributions from sequences
of item responses can be reformulated as maximizing a convex index of goodness
of fit defined on a convex set. A major simplification of several theoretical
problems has been obtained because the linear mathematics used by the theory has
a well-developed generalization to problems involving many variables. For
example, a battery of tests measuring several related variables and one test
measuring one trait can be analyzed with essentially the same theory.

An elementary outline of the basic theory is presented along with a
discussion of several illustrative applications.
Preface


This report is an edited, annotated version of a paper presented to
the Office of Naval Research Contractor's Conference in 1982. Comments
have been inserted to bring the report up to date and to make it easier
to read. The original paper will eventually be distributed in the conference
proceedings under the title "The Trait in Latent Trait Theory."

This  version  is  being  released  now  to  serve  as  an  introduction  to  the 
theory. 
AN INTRODUCTION TO
MULTILINEAR FORMULA SCORE THEORY


Introduction 


A list of potential applications of the theory and
an overview of the report are given in this section.
Since 1982 work has begun on two additional
applications. The theory provides a formula for
calculating an adaptive test score that has the
same conditional expected value as the number-right
score for a conventional test. The theory
also is being used to formulate a way to estimate
the parameters of adaptive test items without
interrupting operational testing.


Just what is being quantified in a latent trait or item response theory
analysis of a mental test? A good theory may shed light on the following
practical problems.

1. Deciding whether two tests measure the same trait or traits;

2. Analyzing the relative contributions of a pair of traits
or abilities to test performance;

3. Detecting "functional" changes in items, including those
caused by security problems, mode of administration changes
and changes in the familiarity with the concepts supporting
the item in the population being tested;

4. Determining the adequacy of an "item response function,"
i.e., a specific mathematical formula relating performance
to ability;

5. Discovering the shape of the item response functions, including
multidimensional item response functions;

6. Quantifying the magnitude and reliability of violations of
the principal assumption of latent trait theory, "local independence";

7. Modelling item responses (such as omitting or changing answers)
that fail to be locally independent.

Some  theoretical  results  bearing  on  these  problems  will  be  outlined. 

The  central  problem  for  the  new  theory  is  to  represent  traits,  abilities 
or  achievements,  and  their  distributions. 

The theory will first be motivated by an informal discussion of some
of its applications. After the basic theory is presented, the discussion of
applications will be resumed. Projected work and work in progress are
also described.




Section  One:  Motivation 
Three  Practical  Problems 


This section should be skipped on first reading. It
was written to stimulate interest in the theory by showing
its relevance to some important applied problems.
Unfortunately, it is hard to read and out of date, since
our new parameter estimation programs permit us to
implement the applications with much smaller sample
sizes. It contains nothing that is needed to understand
the sections that follow.
Three important measurement problems will be used to motivate the theory.

An attempt has been made to keep this paper self-contained. However, this
section, which may be skipped or skimmed, requires familiarity with two
latent trait theory terms, "item response function" and "local independence."
They are now defined for the special situations considered in this section
and redefined in later sections where more generality is needed.

The item response function (also called the item characteristic curve,
ICC, and conditional response function) is the (conditional) probability
of sampling an examinee correctly answering the item from the subpopulation
of all examinees at a particular ability level. Thus, if ability is
unidimensional, then the item response function for the ith item on a test
is the real function

    $P_i(t)$ = the probability of a correct response to item i from
               an examinee sampled from all those with ability = t.

A pair of items, say the ith and jth, are said to be locally independent
if they are independent in subpopulations having no variation in ability, i.e.,
if for all ability levels t,

    $\mathrm{Prob}\{\text{items } i \text{ and } j \text{ are both correct} \mid \text{ability} = t\}$

equals the product of the item response functions

    $P_i(t)\,P_j(t)$.
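The two definitions can be illustrated with a minimal sketch. It assumes a three-parameter logistic curve as the item response function (one of many proposed forms, without the conventional 1.7 scaling constant); the parameter values and the names `irf_3pl` and `prob_both_correct` are hypothetical and are not taken from the report.

```python
import math


def irf_3pl(t, a, b, c):
    """Three-parameter logistic curve: slope a, difficulty b, lower asymptote c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (t - b)))


def prob_both_correct(t, item_i, item_j):
    """Probability that both items are correct at ability t,
    assuming the pair is locally independent."""
    return irf_3pl(t, *item_i) * irf_3pl(t, *item_j)


# Hypothetical (a, b, c) parameters for two items.
item_i = (1.2, 0.0, 0.2)
item_j = (0.8, 1.0, 0.25)
both = prob_both_correct(0.0, item_i, item_j)
```

If the observed joint pass rate at a fixed ability level differs from this product, the pair is not locally independent at that level.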

Three hard measurement problems inevitably arise in the maintenance of
testing programs that attempt to give more or less the same test year after
year to a large number of people. Examples of such programs are the military
entrance and placement programs, college entrance exams, graduate and professional
school admissions exams, high school and grade school aptitude and achievement
tests, and interest measures such as job satisfaction scales used in industrial
settings. Some of those programs test over a hundred thousand examinees
every year.

The three problems (functional item change, item response function
adequacy, and local independence failure) are now described.

Functional item change: An item may function differently, i.e., have
different psychometric properties, in two test administrations. For example, a
vocabulary item requiring exposure to political terminology may seem relatively
easy in a presidential election year. School curriculum changes, security
problems, method of administration change, improper coaching, and item format
change also may result in functional item change. The principal question
here is to determine to what extent, if at all, an item has functionally
changed.

Item response function adequacy: Many mathematical formulas have been
proposed to represent item response functions. Psychological arguments have
been used to challenge the correctness of each, usually over an ability range.
For example, monotonic curves have been criticized for giving an incorrect
representation over a low ability range because very low ability examinees
may perform somewhat better than examinees just bright enough to be misled
by item construction tricks. Curves that asymptote to one have been criticized
because they contain no provision for the careless mistakes of very bright
examinees answering items beneath their ability level. Of course, with
the huge samples of examinees currently available, virtually any guess or
estimate of the population IRF can be rejected. The goal in adequacy
problems is to determine whether a proposed curve is "adequate," i.e., close
enough to the population IRF over an ability range to be acceptable in a
specific application.

Local independence failure: Psychological reasoning or data analysis
can sometimes lead one to suspect that the local independence assumption has
been seriously violated. For example, a pair of reading comprehension items
referring to the same reading passage may be, to an unacceptable degree,
measuring familiarity with the content of the passage. An example arising
in an empirical item bias study is described later in this section. One
application of the theory being developed is to determine the magnitude and
reliability of suspected local independence failures.

After  some  preliminaries,  the  three  problems  will  be  considered  separately. 
Table  I  summarizes  the  discussion.  It  may  be  helpful  for  the  reader  to  refer 
back  to  it  from  time-to-time. 

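TABLE I  Summary of Formula Score Theory Analysis of Three Basic Measurement Problems

Problem                 Hypothesis                 Population Parameter                         Distribution of Test Statistic $\hat\eta$
----------------------  -------------------------  -------------------------------------------  -----------------------------------------
1. Functional Change    $L^* = P$                  $\eta = \int (P - L)^2$                      Quadratic function of normal variables;
                                                                                                non-central case.

2. IRF Adequacy         $L = P$                    $\eta = \int (P - L)^2$                      Quadratic function of normal variables;
                                                                                                central case.

3. Local Independence   $P_1 P_2 = P_{1\&2}$,      $\eta = \int (P_{1\&2} - L_1 L_2)^2$         Quadratic function of normal variables;
                        $P_1 = L_1$, $P_2 = L_2$                                                central case.

(Each integral is taken over the ability range of interest.)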
It will be shown that all the questions can be reduced to a single question
about two curves or functions: How close is a specified (either by a formula
or table) function $L(\cdot)$ to an incompletely specified function $P(\cdot)$, the
population item response function? The problems are hard to answer because
abilities are not observed, only estimated.

Consider the conceptually simplest problem type, item response function
adequacy. A "three parameter logistic function" L has been estimated and
offered as a representation of the "true," i.e., population item response
function P. The psychometrician is concerned about the monotonicity of
P and suspects that L fits P poorly over, say, the ability range
$-3 < \theta < -2$. He wishes to determine how far apart P and L are over
this range.

An intuitive and commonly used measure of the distance between two
functions is the generalization of Euclidean distance given by the root
mean square of function values. In this spirit, an attempt will be made to
compute a point estimate and confidence interval for

    $\eta = \int_{-3}^{-2} [P(t) - L(t)]^2 \, dt$.

The interval [-3,-2] in the definition of $\eta$ is arbitrary. The
hypothesis being tested and the sample of examinees available for testing
will generally suggest a different center and width of the "supporting"
interval. Short intervals give more specific information about the difference
between P and L. Very short intervals give estimates of $\eta$ with a large
sampling error.
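As a rough illustration of the quantity being estimated, the following sketch evaluates the integrated squared difference between two curves over [-3,-2] by Simpson's rule. Both curves are hypothetical: `L` is an arbitrary three-parameter logistic and `P` is `L` shifted by a constant 0.05, chosen only so that the answer (0.05 squared times an interval of width one) is easy to check by hand. In practice P is unknown, which is what makes the problem hard.

```python
import math


def simpson(g, lo, hi, panels=200):
    """Composite Simpson rule for the integral of g over [lo, hi] (panels even)."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


def eta(P, L, lo=-3.0, hi=-2.0):
    """Integrated squared difference between curves P and L over [lo, hi]."""
    return simpson(lambda t: (P(t) - L(t)) ** 2, lo, hi)


def L(t):
    """Hypothetical fitted three-parameter logistic curve."""
    return 0.2 + 0.8 / (1.0 + math.exp(-1.1 * (t + 0.5)))


def P(t):
    """Hypothetical population curve: L shifted upward by 0.05."""
    return L(t) + 0.05


distance = eta(P, L)  # close to 0.05 ** 2 * 1.0 = 0.0025
```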


The results of this section are made possible by an elementary equation
which is valid at each point t where ability densities are continuous.

(1)  $[P(t) - L(t)]\,f(t) = f^{+}(t)\,[1 - L(t)]\,\bar{P} - f^{-}(t)\,L(t)\,\bar{Q}$

In this equation P and L are as before. The density for the ability $\theta$
in the general population of examinees is denoted by $f$. $f^{+}$ is the ability
density in the subpopulation of examinees passing the target item; $f^{-}$ is
the conditional density in the failure subpopulation. $\bar{P} = 1 - \bar{Q}$ is the
proportion of item passers. The equation follows from
$f^{+}(t)\bar{P} = P(t)f(t)$ and $f^{-}(t)\bar{Q} = [1 - P(t)]f(t)$, since
$P - L = P(1 - L) - L(1 - P)$.
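The identity can be checked numerically. The sketch below takes the conditional densities from Bayes' rule, $f^{+}(t) = P(t)f(t)/\bar{P}$ and $f^{-}(t) = [1 - P(t)]f(t)/\bar{Q}$, under which the difference $[P(t) - L(t)]f(t)$ equals $f^{+}(t)[1 - L(t)]\bar{P} - f^{-}(t)L(t)\bar{Q}$. The density `f` and the curves `P` and `L` are hypothetical choices made only for the check.

```python
import math


def simpson(g, lo, hi, panels=200):
    """Composite Simpson rule for the integral of g over [lo, hi] (panels even)."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


def f(t):
    """Hypothetical ability density on [-1, 1]."""
    return 0.75 * (1.0 - t * t)


def P(t):
    """Hypothetical population item response function."""
    return 1.0 / (1.0 + math.exp(-2.0 * t))


def L(t):
    """Hypothetical candidate curve to be compared with P."""
    return 0.1 + 0.9 / (1.0 + math.exp(-1.5 * (t - 0.2)))


P_bar = simpson(lambda t: P(t) * f(t), -1.0, 1.0)  # proportion of item passers
Q_bar = 1.0 - P_bar


def f_plus(t):
    """Ability density among examinees who pass the item (Bayes' rule)."""
    return P(t) * f(t) / P_bar


def f_minus(t):
    """Ability density among examinees who fail the item (Bayes' rule)."""
    return (1.0 - P(t)) * f(t) / Q_bar


def lhs(t):
    return (P(t) - L(t)) * f(t)


def rhs(t):
    return f_plus(t) * (1.0 - L(t)) * P_bar - f_minus(t) * L(t) * Q_bar


max_gap = max(abs(lhs(t) - rhs(t)) for t in (-0.9, -0.5, 0.0, 0.4, 0.8))
```

The right-hand side involves P only through the two conditional densities and the pass rate, all of which can be estimated from samples of passers and failers.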

This equation is important because it permits one to evaluate adequacy
questions  without  estimating  abilities.  It  is  necessary  to  do  so  because  for 
a  test  of  fixed  length  any  estimate  of  ability  has  a  substantial  standard 
error,  a  bound  for  which  can  be  computed  by  routine  methods.  On  the  other 
hand,  subject  to  technical  qualifications  treated  at  length  below,  an 
arbitrarily  accurate  estimate  of  the  distribution  of  abilities  can  be 
obtained  with  a  sufficiently  large  sample  of  examinees,  and  test  administrations 
of  over  1,000,000  examinees  are  no  longer  uncommon.  In  later  sections, 
consistent  estimates  of  ability  densities  are  discussed. 

In view of the very large sample sizes, $f$ will be regarded as known.
The effects of small errors in specifying $f$ on the sampling distribution of
our estimate of $\eta$ have not been worked out at this time.

The quantity $\bar{P}$ on the right-hand side can either be computed as
$\int L(t) f(t)\,dt$ when the hypothesis P = L is being evaluated or estimated
as the sample proportion of item passers.
In most applications, only moderately large samples are available for
estimating the conditional densities $f^{+}$ and $f^{-}$. Using Equation (1), one
can write

(2)  $\eta = \int_{-3}^{-2} \{ f^{+}(t)[1 - L(t)]\bar{P} - f^{-}(t)L(t)\bar{Q} \}^2 \, W(t)\, dt$

where the weight function $W(t)$ is $1/[f(t)]^2$. Upon substituting
estimates $\hat{f}^{+}(\cdot)$ and $\hat{f}^{-}(\cdot)$ for $f^{+}(\cdot)$ and $f^{-}(\cdot)$, one obtains a statistic
$\hat\eta$,

(3)  $\hat\eta = \int_{-3}^{-2} \{ \hat{f}^{+}(t)[1 - L(t)]\bar{P} - \hat{f}^{-}(t)L(t)\bar{Q} \}^2 \, W(t)\, dt$

that can be used to evaluate IRF adequacy. It will be seen that $\hat\eta$ has a
tractable sampling distribution.

To motivate some theoretical developments on density representation
and estimation in the next section, suppose the conditional densities could
be represented in the form

(4)  $f^{+}(\theta) = \sum_{j=1}^{J} \alpha_j^{+} h_j(\theta)$

(5)  $f^{-}(\theta) = \sum_{j=1}^{J} \alpha_j^{-} h_j(\theta)$

for known functions $h_1, \ldots, h_J$ and constants $\alpha_j^{+}$, $\alpha_j^{-}$.

Then, after the indicated integration in (2) is carried out, $\eta$ has the
particularly simple form

(6)  $\eta = \alpha Q \alpha^{T}$

where $\alpha$ is the vector $\langle \alpha_1^{+}, \ldots, \alpha_J^{+}, \alpha_1^{-}, \ldots, \alpha_J^{-} \rangle$ and $Q$ is a matrix of
numbers that can be calculated prior to data collection. The entries in
$Q$ are obtained by substituting (4) and (5) into (2), expanding the product
and numerically calculating the integral of the product of the specified
functions. It is easily verified that $Q$ is symmetric, positive semidefinite.
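A minimal sketch of how such a matrix might be assembled is given below. Everything in it is an assumption made for illustration: the basis functions `1, t, t*t` stand in for the unspecified $h_j$, the density `f` is standard normal, the pass rate `P_BAR` is fixed at 0.6, and the failure term enters with a minus sign, as Bayes' rule gives for $[P - L]f$. The sketch only confirms that the quadratic form reproduces the direct integral of the squared, weighted combination.

```python
import math


def simpson(g, lo, hi, panels=200):
    """Composite Simpson rule for the integral of g over [lo, hi] (panels even)."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


LO, HI = -3.0, -2.0   # ability range from the running example
P_BAR = 0.6           # hypothetical proportion of item passers
Q_BAR = 1.0 - P_BAR


def f(t):
    """Hypothetical known ability density (standard normal)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def L(t):
    """Hypothetical candidate item response function."""
    return 0.2 + 0.8 / (1.0 + math.exp(-1.1 * (t + 0.5)))


def W(t):
    """Weight function 1 / f(t)**2."""
    return 1.0 / f(t) ** 2


# Hypothetical stand-ins for the known functions h_1, ..., h_J (J = 3).
basis = [lambda t: 1.0, lambda t: t, lambda t: t * t]


def passer_term(h):
    return lambda t: h(t) * (1.0 - L(t)) * P_BAR


def failer_term(h):
    return lambda t: -h(t) * L(t) * Q_BAR


# The integrand is (sum_k alpha_k * components[k](t)) ** 2 * W(t).
components = [passer_term(h) for h in basis] + [failer_term(h) for h in basis]

# Q depends only on L, f, the basis and the interval, not on item responses.
Q = [[simpson(lambda t: ci(t) * cj(t) * W(t), LO, HI) for cj in components]
     for ci in components]


def quadratic_form(alpha, matrix):
    size = len(alpha)
    return sum(alpha[k] * matrix[k][m] * alpha[m]
               for k in range(size) for m in range(size))


alpha = [0.3, 0.1, 0.02, 0.25, 0.05, 0.01]   # hypothetical coordinates
via_matrix = quadratic_form(alpha, Q)
direct = simpson(
    lambda t: sum(a * c(t) for a, c in zip(alpha, components)) ** 2 * W(t),
    LO, HI)
```

Because `Q` does not depend on the responses, it needs to be computed only once per item and interval; each new sample then costs one quadratic form.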

Such a representation has been derived. The functions $h_j$ are derived
from a priori considerations given in the next section. The number of them,
$J$, turns out to be acceptably small, between 4 and 8, for the tests
already analyzed.

Consistent, unbiased estimates for the vector of constants are described
in the following sections. With them, one obtains estimated densities

    $\hat{f}^{+}(\cdot) = \sum_{j=1}^{J} \hat\alpha_j^{+} h_j(\cdot)$

    $\hat{f}^{-}(\cdot) = \sum_{j=1}^{J} \hat\alpha_j^{-} h_j(\cdot)$.

Here $\hat\alpha_j^{+}$ and $\hat\alpha_j^{-}$ are estimates of the corresponding constants.

The vector of estimates $\hat\alpha$ will be seen to be multivariate normal, at
least asymptotically. The hypothesis P = L permits one to calculate, prior
to data collection, the covariance matrix of the estimates and derive the
distribution of $\hat\eta$.

Random variables of the form $\alpha Q \alpha^{T}$, where $\alpha$ is multivariate normal
and $Q$ is positive semidefinite, generalize the $\chi^2$ family of random variables.
In the "central case," the statistic has the same distribution as the sum of
squares of several independent normal variables with zero mean and not
necessarily equal variances. In the "non-central case," the means may be
nonzero. The asymptotic normality and the hypothesis P = L make the central
case appropriate. A numerical algorithm has been developed for computing
the cdf $F(x) = \mathrm{Prob}\{\hat\eta < x\}$ and determining the probability of observing
an $\hat\eta$ equal to the sample value or larger under the hypothesis $\eta = 0$, i.e.,
P = L (Williams, 1984). For a review of alternative algorithms see Kotz
et al., 1967a,b. Technical details on the derivation and distribution of
$\hat\eta$ are in Levine, 1983.
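The cited algorithm is not reproduced here. As a stand-in, the following sketch approximates the upper-tail probability of a central quadratic form by plain Monte Carlo simulation, which is a different and much cruder technique. It assumes the statistic has already been reduced to a weighted sum of squared independent standard normal variables; the weights (the variances of the zero-mean normals) and the function name `tail_probability` are hypothetical.

```python
import random


def tail_probability(weights, observed, draws=20000, seed=0):
    """Monte Carlo estimate of Prob{sum_k weights[k] * z_k**2 >= observed},
    where the z_k are independent standard normal variables."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        value = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if value >= observed:
            hits += 1
    return hits / draws


# With a single unit weight the statistic is chi-square with one degree of
# freedom, so the tail probability beyond 3.841 should be near 0.05.
p_single = tail_probability([1.0], 3.841)

# Unequal hypothetical weights give the generalized central case.
p_mixed = tail_probability([2.0, 0.5, 0.1], 6.0)
```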

The above approach can be used to attack functional item change questions.
In the treatment of adequacy, the distribution of $\hat\eta$ was derived under the
hypothesis P = L. In studying change, the discrepancy between P and L
is measured under the hypothesis that P = L*, where L* is some specified
function other than L. Suppose, for example, an item response function
has been carefully measured using a very large sample and that years of
successful experience with the item were consistent with the IRF used to represent
it. The hypothesis P = L* has considerable support. However, after a
security problem comes to light, a re-estimation with a smaller sample gives
a function L ≠ L*. Further suppose only low ability examinees are motivated
to exploit the security problem. Under the hypothesis P = L*, how large
is the squared difference between P and L expected to be over the low
ability range? By a generalization of the arguments described above, the
distribution of $\hat\eta$ can be derived. It turns out to be a quadratic function
of normal variables, non-central case. Formulas for the variances and non-centrality
parameters are in Levine, 1983.

The method of this section suggests a way to quantify suspected departures
from local independence. For example, in an item bias study in progress, a
vocabulary item using the word "hurl" was found to be severely and reliably
biased against sixth grade girls and in favor of sixth grade boys in two
independent samples of 4,000 and 2,000 children. It seems likely that
performance on another item also using a word favored by baseball writers
would agree more with the hurl item score than predicted by the local
independence assumption of latent trait theory. To analyze the causes and
consequences of bias it would be valuable to have a method for measuring the
magnitude and reliability of local independence violations over specified
ability ranges.

To test for local independence, two suspect items may be (conjunctively)
paired to form a complex item, an item that is scored correct if both component
items are correct and incorrect otherwise. If the items are locally independent,
then the item characteristic curve of the complex item will be the product
of item characteristic curves of the component items. Evidence for a violation
of local independence over an ability range would be small
$\eta_1 = \int [P_1(t) - L_1(t)]^2\,dt$ and $\eta_2 = \int [P_2(t) - L_2(t)]^2\,dt$
but large $\eta_{12} = \int [P_{1\&2}(t) - L_1(t)L_2(t)]^2\,dt$, with each integral
taken over that range.

In all the above examples, densities and conditional densities were
represented as linear combinations of a finite set of known functions. It
will be shown in later sections that every density is, in a sense soon to
be made precise, equivalent to exactly one of these linear combinations. Every
test will be shown to have associated with it a unique "canonical space" or
vector space of functions equivalent to densities. It is hoped that this
discussion shows that a theory for density representation and estimation
can be used to attack fundamental issues in psychological measurement. A
new theory is outlined in the next section. Some further applications to
basic scientific questions follow.




Section  Two:  The  Canonical  Space  and 

Equivalent  Ability  Distributions 


The statistics $X_j$ described in this section
and the next were being used in 1982 to estimate
"coordinates." They remain important for the
theory because they show that the "identifiable
part" of a density can be estimated. However,
much more efficient coordinate estimation
strategies are now available for applications.




In the preceding section a relation was noted between several
hard, important substantive psychological issues and the more routine
methodological problem of density estimation. The approach of the
previous section required a representation of ability densities as
finite linear combinations of known functions and multivariate normal
estimates of the coefficients in the combinations.

In this section an a priori derivation of the representation is
reviewed on a very informal level. A somewhat more formal presentation
of the derivation is outlined in the next section. This approach
to psychometric problems will be called "formula scoring" or "formula
score theory"¹ and abbreviated FS and FST.

The  analysis  is  organized  about  three  fundamental  theoretical 
issues  and  methodological  problems. 

1.  Ability  Distribution  Equivalence:  Which,  if  any,  pairs  of 
fundamentally  different  ability  distributions  lead  to  exactly 
the  same  probability  distributions  on  the  only  observables  in 
testing,  the  item  scores?  What  are  necessary  and  sufficient 
conditions  for  two  distributions  to  be  equivalent  (in  the  sense 
of  making  the  same  predictions)?  What  statements  about  the 
distribution  are,  in  the  technical,  foundations  of  measurement 
sense  of  the  term,  meaningful? 


¹ The term "multilinear formula score theory" is now being used to
distinguish the theory from earlier work by other authors on
linear combinations of item scores.
2. Ability Distribution Representation: Find a decomposition
of an arbitrary ability density into two uniquely determined
parts

    $f(\cdot) = f_0(\cdot) + f^{*}(\cdot)$

such that two densities $f_1$ and $f_2$ are equivalent if and
only if $f_1^{*} = f_2^{*}$. Find a finite dimensional parameterization
of the "identifiable part" $f^{*}$ of the ability density $f$.


3. Ability Distribution Identification and Estimation: Show that the
"identifiable part" of the ability density is identifiable in the sense
that a consistent estimate of $f^{*}(t)$ exists for each t. Construct
an estimator.

The results of this section are derived from a version of latent
trait theory informally presented now and somewhat more formally
stated in the next section. The major random variables of the latent
trait model are abilities $\theta$ and item scores $u_1, u_2, \ldots, u_n$. Examinees
are considered to be randomly sampled from an infinite population
of examinees. The "points" in the probability space of the basic
latent model are examinees. Each examinee has a specific ability
and (non-random) vector of item responses. Abilities and item responses are
non-trivial random variables only because examinees are sampled.

Item responses are assumed to be "locally independent," i.e.,
independent in the subpopulations of examinees defined by conditioning
upon ability. Although it may not be obvious at first reading, this
conceptualization of latent trait theory is compatible with the usual
treatment of item responses as independent binomial random variables
with success probabilities that are functionally dependent on abilities,
provided no item is ever administered two times to the same examinee.


To attack the problems of identifying, representing and estimating
ability distributions from a foundations of measurement point of view,
the set of all statistics for a test is studied. Since only the item
scores $u_i$ are observed and since examinees work independently of
one another, the set of all statistics is simply the set of number-valued
functions of the item score random variables. Moreover this set can
be shown to be a finite dimensional vector space.

An important tool for studying ability distributions in formula
score theory is the canonical space of a test, formulated by referring
to regression functions. The regression function of a statistic S
is the real function

    $R_S(t) = E[S \mid \theta = t]$.


The canonical space (CS) of a test is the vector space of all regression
functions. It is easily shown to be finite dimensional. In fact, in
many FST applications it has been possible to treat it as a vector space
of low (less than 8) dimensionality. (See Section 4 for further
discussion of CS dimensionality.)
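For a short test the regression function of any statistic can be computed by brute force: sum the statistic over all $2^n$ item-score patterns, weighting each pattern by its conditional probability under local independence. The sketch below does this for a hypothetical three-item test on the ability interval [0, 1]; the item response functions and the names `pattern_prob` and `regression_function` are illustrative assumptions.

```python
from itertools import product


def pattern_prob(pattern, irfs, t):
    """Prob{U = pattern | ability = t}, assuming local independence."""
    prob = 1.0
    for u, P in zip(pattern, irfs):
        prob *= P(t) if u == 1 else 1.0 - P(t)
    return prob


def regression_function(statistic, irfs):
    """Return R_S, where R_S(t) = E[S | ability = t], by enumerating patterns."""
    n = len(irfs)

    def R(t):
        return sum(statistic(u) * pattern_prob(u, irfs, t)
                   for u in product((0, 1), repeat=n))

    return R


# Hypothetical item response functions on the ability interval [0, 1].
irfs = [lambda t: t, lambda t: t * t, lambda t: 0.25 + 0.5 * t]

# The number-right score regresses onto the sum of the item response functions.
number_right = regression_function(sum, irfs)

# The conjunctive pairing of items 1 and 2 regresses onto their product.
both_correct = regression_function(lambda u: u[0] * u[1], irfs)
```

Every regression function obtained this way is a linear combination of products of item response functions, which is why the space is finite dimensional.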

Before proceeding, several assumptions commonly used in FST
are listed. First, the "item characteristic curves"

    $P_i(t)$ = Probability that item i is answered correctly
               given an examinee with ability equal to t has
               been sampled

             = $E(u_i \mid \theta = t)$

are assumed to be continuous. In addition, all abilities are
assumed to lie in a closed bounded interval $I$. The assumption of
continuous $P_i$ is restrictive and may have to be dropped for some
applications. The assumption of bounded abilities, on the other
hand, results in no loss of generality because any latent trait
model can be reformulated by routine methods as an isomorphic model
with bounded abilities. These assumptions together imply that the
canonical space consists of continuous functions on the interval $I$.

A major result of FST is that two densities are equivalent in the
sense of issue 1 above if they have the same projection into the canonical
space. Thus if some $J$ functions $h_1, h_2, \ldots, h_J$ form a basis for the CS and if

    $\int_I h_j(t) f_1(t)\,dt = \int_I h_j(t) f_2(t)\,dt, \qquad j = 1, 2, \ldots, J$

then there is no objective way to choose between $f_1$ and $f_2$.

By an "objective way to choose between $f_1$ and $f_2$" is meant a
method of using the observables (the item scores) to decide which of
$f_1$ or $f_2$ is more nearly correct. This is impossible because it can
be proven that every statistic has the same probability distribution
when $f_1$ is correct as when $f_2$ is correct.
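A small constructed example may make the equivalence concrete. It is not taken from the report: it assumes a one-item test on $I = [0, 1]$ with item response function $P_1(t) = t$, so that the regression functions are spanned by 1 and t. The densities `f1` (uniform) and `f2` differ by a multiple of $6t^2 - 6t + 1$, which is orthogonal to both 1 and t on [0, 1]. The sketch shows that the two densities give identical item-score probabilities on the one-item test, and that adding a hypothetical second item with $P_2(t) = t^2$ enlarges the space enough to tell them apart.

```python
from itertools import product


def simpson(g, lo, hi, panels=200):
    """Composite Simpson rule for the integral of g over [lo, hi] (panels even)."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


def pattern_probs(irfs, density, lo=0.0, hi=1.0):
    """Marginal probability of every item-score pattern under local independence."""
    probs = {}
    for pattern in product((0, 1), repeat=len(irfs)):
        def integrand(t, pattern=pattern):
            value = density(t)
            for u, P in zip(pattern, irfs):
                value *= P(t) if u else 1.0 - P(t)
            return value
        probs[pattern] = simpson(integrand, lo, hi)
    return probs


def f1(t):
    """Uniform ability density on [0, 1]."""
    return 1.0


def f2(t):
    """Uniform density plus a component orthogonal to 1 and t on [0, 1]."""
    return 1.0 + 0.5 * (6.0 * t * t - 6.0 * t + 1.0)


def largest_gap(irfs):
    """Largest difference in pattern probabilities between f1 and f2."""
    p1, p2 = pattern_probs(irfs, f1), pattern_probs(irfs, f2)
    return max(abs(p1[key] - p2[key]) for key in p1)


one_item = [lambda t: t]
two_items = [lambda t: t, lambda t: t * t]

gap_one = largest_gap(one_item)    # essentially zero: equivalent densities
gap_two = largest_gap(two_items)   # clearly nonzero: now distinguishable
```

Equivalence is therefore a property of a density pair relative to a particular test, not of the densities alone.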

This fact leads to a useful representation of densities. An
arbitrary density $f$ can be represented uniquely as

    $f(\cdot) = f_0(\cdot) + \sum_{j=1}^{J} \alpha_j h_j(\cdot)$

    $\phantom{f(\cdot)} = f_0(\cdot) + f^{*}(\cdot)$

where $\{h_j\}$ is a basis for the CS and $f_0$ is orthogonal to the
regression function of every statistic, i.e.,

    $\int_I E(S \mid \theta = t)\, f_0(t)\,dt = 0$

for every statistic S. In this decomposition, $f_0$ is called the
null part of $f$, $f^{*}$ the identifiable part of $f$ and $\alpha_j$
the jth coordinate of $f$.

The identifiable part of $f$ is indeed identifiable because a
sequence of estimators $\{\hat{f}_N(t)\}$ can be constructed that will (almost surely)
converge to $f^{*}(t)$ as sample size N is increased. The convergence
turns out to be uniform in t.

The null part of $f$ is null in the sense that it is totally unrelated
to data. There is no objective way to use the administered items to
distinguish two densities with the same identifiable parts and different
null parts. Such densities cannot and, for most purposes, need not
be distinguished. Both densities lead to the same predictions in all
applications. A proposition or scientific statement that is true if $f_1$
is the ability density and false if $f_2$ is the ability density is, in the
technical, foundations of measurement sense of the term, not meaningful.
The proposition may be interesting, clearly stated and important, but there
will be no way to tell if it is true or false from the observed responses
to the test items.

The representation leads to a strategy for estimating densities.
If the $h_j$ (called coordinate functions) are orthonormal, then the
coordinates $\alpha_j$ have a statistical interpretation,

    $\alpha_j = E[h_j(\theta)]$,

i.e., $\alpha_j$ is the expected value of a function of the unobserved ability $\theta$.

Since $h_j$ is in the CS, $h_j$ is the regression function of some
statistic, say $X_j$; that is, the conditional expected value of $X_j$
is equal to $h_j$:

    $E(X_j \mid \theta = t) = h_j(t)$.

Since $X_j$ is simply a function of item scores, $X_j$ can be computed
for each examinee in a large sample of, say, N examinees to obtain a
sample mean $\bar{X}_j$. By the law of large numbers the estimate

    $\hat{f}_N(t) = \sum_{j=1}^{J} \bar{X}_j h_j(t)$

will converge (in probability) to the identifiable part of $f$ as the
sample size N becomes large.
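The estimation strategy can be tried out on simulated data. The sketch below is an illustration under assumptions that are not in the report: a one-item test on [0, 1] with item response function $P_1(t) = t$, orthonormal coordinate functions $h_1(t) = 1$ and $h_2(t) = \sqrt{3}(2t - 1)$, and a true ability density $f(t) = 2t$, which happens to lie in the canonical space so that its identifiable part is $f$ itself. The statistics `x1` and `x2` are chosen so that their regression functions are `h1` and `h2`.

```python
import math
import random

SQRT3 = math.sqrt(3.0)


def h1(t):
    """First coordinate function (constant)."""
    return 1.0


def h2(t):
    """Second coordinate function, orthonormal to h1 on [0, 1]."""
    return SQRT3 * (2.0 * t - 1.0)


def x1(u):
    """Statistic whose regression function is h1."""
    return 1.0


def x2(u):
    """Statistic whose regression function is h2, because E[u | ability = t] = t."""
    return SQRT3 * (2.0 * u - 1.0)


def estimate_identifiable_part(n_examinees=20000, seed=1):
    """Simulate item scores and return the estimated density as a function of t."""
    rng = random.Random(seed)
    total1 = total2 = 0.0
    for _ in range(n_examinees):
        theta = math.sqrt(rng.random())          # ability drawn from f(t) = 2t
        u = 1 if rng.random() < theta else 0     # item score with P(t) = t
        total1 += x1(u)
        total2 += x2(u)
    mean1, mean2 = total1 / n_examinees, total2 / n_examinees
    return lambda t: mean1 * h1(t) + mean2 * h2(t)


f_hat = estimate_identifiable_part()
```

Abilities are never estimated for individual examinees here; only sample means of functions of the item scores are used.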

This estimate is especially well behaved and easy to study because
the examinees are independently sampled. In fact, for sufficiently large
sample size N the vector of sample averages

    $\langle \bar{X}_1, \bar{X}_2, \ldots, \bar{X}_J \rangle$

will be nearly multivariate normal with covariance matrix that, at least
in some applications, can be regarded as known.

These loosely presented observations and definitions are collected,
summarized and extended in the following outline.


The  preceding  section  is  now  outlined  for  clarity  and  ease  of 
future  reference.  Some  additional  assumptions  and  notation  are 
introduced  in  this  outline. 

Basic  Latent  Trait  Model  &  Notation  (Simplest  one-dimensional  version) 

$\Omega = \{\omega\}$ = the probability space
    = an infinite set of actual or conceivable examinees
      available for sampling and testing

$\theta$ = ability random variable. $\theta(\omega)$ is unobserved.

$f(\cdot)$ = the density for $\theta$. Its support will always be assumed
    to be contained in an interval $I$. Except when noted, only
    continuous densities are considered.

$I$ = a closed interval containing all abilities, $\int_I f(t)\,dt = 1$

$n$ = number of test items

$U$ = a random $n$-vector of item scores, the only observables
    = $\langle u_1, u_2, \ldots, u_n \rangle$

$u_i$ = item score random variable. $u_i(\omega)$ is either zero or one.

$P_i(\cdot)$ = item response function. $P_i(t) = E(u_i \mid \theta = t)$. Also called
    an item characteristic function. These functions are
    assumed to be continuous.


Local independence assumption: For any $n$-vector of zeros and ones
$U^* = \langle u_1^*, u_2^*, \ldots, u_n^* \rangle$

$$\text{Prob}\{U = U^* \mid \theta = t\} = \prod_{i=1}^{n} \{u_i^* P_i(t) + (1 - u_i^*)[1 - P_i(t)]\}$$

Statistic:  A  number-valued  function  of  the  item  scores. 

Basic  Formula  Score  Terminology 

Regression Function: The regression function of a statistic $S$
is the conditional expectation

$$R_S(t) = E[S \mid \theta = t].$$

Canonical Space: The real vector space of all regression functions
for a test. Abbreviated CS. A finite dimensional subspace of
the vector space of all continuous functions defined on $I$. It
can be shown to have dimension at most $2^n$.

J: The dimension of the canonical space. Discussed in Section 4.

$(\cdot,\cdot)$: Notation for the inner product used on the space of
continuous functions defined on $I$. $(g,h) = \int_I g(t)h(t)\,dt$.

Note $(h,f) = E[h(\theta)]$.

Coordinate Functions: An orthonormal basis for the CS of a test.
Generally denoted $\{h_1, h_2, \ldots, h_J\}$.

$\alpha_j$: The projection of the ability density on the $j$th coordinate
function, $h_j$. Called the $j$th coordinate of $f$. Has statistical
interpretation $\alpha_j = E[h_j(\theta)]$.

Ability Density Equivalence

$\chi_A(\cdot)$: The indicator function of the set $A$

$$\chi_A(t) = \begin{cases} 0 & \text{if } t \text{ is not in } A \\ 1 & \text{if } t \text{ is in } A \end{cases}$$

$P[\cdot\,; S, \cdot]$: Notation for the probability distribution of the statistic $S$.

$P[A; S, f_1]$ is the probability that the statistic $S$ is in
the set $A$ when $f_1$ is the density for $\theta$.

$$P[A; S, f_1] = \int E[\chi_A(S) \mid \theta = t]\, f_1(t)\,dt$$

Equivalent Densities: Two densities $f_1$, $f_2$ are equivalent if
for every statistic $S$

$$P[\cdot\,; S, f_1] = P[\cdot\,; S, f_2],$$

i.e., if the probability distribution of each statistic is the same
when $f = f_1$ as when $f = f_2$.

Characterization of Equivalent Ability Densities: $f_1$ is equivalent
to $f_2$ if and only if

$$(f_1, h_j) = (f_2, h_j) \quad \text{for } j = 1, 2, \ldots, J$$

for any set of coordinate functions $\{h_j\}$.


Ability Density Decomposition

$g = g_0 + g^*$: Every density $g$ on $I$ can be expressed uniquely
in the form

$$g(\cdot) = g_0(\cdot) + g^*(\cdot)$$

where $g^*$ is in the canonical space and for every statistic $S$

$$(R_S, g) = (R_S, g^*).$$

$g^*$: The identifiable part of a density $g$. The projection
of the density into the CS.

$$g^*(t) = \sum_{j=1}^{J} \alpha_j h_j(t)$$

for coordinate functions $h_1, h_2, \ldots, h_J$.

$g_0$: The null part of a density $g$.

$$g_0 = g - g^*.$$

Cannot and generally need not be estimated because $f_1^* = f_2^*$
implies $P[\cdot\,; S, f_1] = P[\cdot\,; S, f_2]$ for every statistic $S$.

Ability  Density  Representation 

Densities and J vectors: The mapping

$$g \mapsto \langle (g,h_1), (g,h_2), \ldots, (g,h_J) \rangle$$

associates each density on $I$ with a unique J vector.
Densities associated with the same J vector are equivalent.


Consistent,  Unbiased  Estimates  of  the  Identifiable  Part  of  the 
Ability  Density 

$X_j$: A statistic with regression function equal to $h_j$;
i.e., $R_{X_j}(\cdot) = h_j(\cdot)$. There must be at least one because
$h_j$ is in the CS and the CS consists of regression
functions only.

$X_j$ must be bounded because it has all of its probability
on a set of $2^n$ points.

$E(X_j) = \alpha_j$: Follows from

$$E(X_j) = E[E(X_j \mid \theta)] = E[R_{X_j}(\theta)] = E[h_j(\theta)] = (h_j, f).$$

$\bar{X}_{j,N}$: Sample mean of $X_j$ from a sample of $N$ examinees.

$\bar{X}_N = \langle \bar{X}_{1,N}, \bar{X}_{2,N}, \ldots, \bar{X}_{J,N} \rangle$: Sample mean of $N$ bounded, independent,
identically distributed random vectors. Converges to
$\alpha = \langle \alpha_1, \alpha_2, \ldots, \alpha_J \rangle$. Asymptotically $N^{1/2}(\bar{X}_N - \alpha)$ is multivariate normal.

$\hat{f}_N(t) = \sum_{j=1}^{J} \bar{X}_{j,N} h_j(t)$:
a consistent, unbiased estimate of the identifiable part $f^*$ of
the ability density. Asymptotically $N^{1/2}[\hat{f}_N(t) - f^*(t)]$ is normal.

Construction of Coordinate Functions $h_j$ and Coordinate Estimators $X_j$:
See Section 4.


AN  INTRODUCTION  TO 
MULTILINEAR  FORMULA  SCORE  THEORY 


Section Four: Implementation of Ability Density Results


For some applications we have been able to improve
upon CFSM (described below) by developing a parallel
theory in which the item scores $u_i$ are replaced
by the scores $w_i = 2u_i - 1$. The "complete"
set of statistics is the set of all products of these
$w_i$ instead of $u_i$. The components of CFSM were
regression functions $R_{u_i}(\cdot) = P_i(\cdot)$, an easily
computed formula for associating item response
patterns with continuous functions,
$V(t) = \prod_i [1 + u_i P_i(t)]$, and an operator defined by the
function $H(s,t) = E[V(t) \mid \theta = s] = \prod_i [1 + P_i(s)P_i(t)]$.
The analogous components for the new method are
described at the end of this section.


The abstract results given in the preceding sections are being
used now to estimate densities and item characteristic curves. This
section is included to show in a general way how the theory is used to
analyze data. The discussion is organized about four technical
questions that commonly arise in response to presentations of the
theory.

1. How are the coordinate functions $h_1, h_2, \ldots, h_J$ determined
in actual applications?

2. How are the statistics $X_j$ specified?

3. What is the dimension $J$ and how is it determined?

4. Can the calculations be arranged in a way to avoid very long,
numerically unstable calculations?

A set of statistics $\{S_k\}$ is called complete if any statistic
$S$ can be written as a linear combination of finitely many of them.

For example, the elementary formula scores $\{v_k\}$ formed by considering
all products of item scores are complete. These are the scores

$$1$$
$$u_1, u_2, \ldots, u_n$$
$$u_1u_2, \ldots, u_{n-1}u_n$$
$$\vdots$$
$$u_1u_2 \cdots u_n .$$

The regression functions $R_{v_k}(t) = E[v_k \mid \theta = t]$ are simply products of
item characteristic curves $P_i$.

A set of coordinate functions can be constructed from any complete set
of statistics as follows. First a function on $I \times I$ is specified by the formula

$$H(s,t) = \sum_k R_{S_k}(s)\, R_{S_k}(t).$$

"Is specified" in practical terms means that a computer subroutine
is written that accepts pairs of numbers $s,t$ as input and returns the
number given on the right hand side as output. In the special case
where $\{S_k\} = \{v_k\}$, the "elementary" formula scores, it can be shown
that $H$ has the easily calculated form

$$H(s,t) = \prod_{i=1}^{n} [1 + P_i(s)P_i(t)].$$

This special case will be discussed after the general case is treated.

Using standard methods $H(s,t)$ is decomposed into a finite sum of
products of orthonormal functions. More specifically, a set of positive
numbers

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_J > 0$$

and orthonormal functions $h_j$, $j = 1, 2, \ldots, J$,

$$(h_j, h_{j'}) = \begin{cases} 1 & \text{if } j = j' \\ 0 & \text{otherwise} \end{cases}$$

are computed such that

$$(*)\qquad H(s,t) = \sum_{j=1}^{J} \lambda_j h_j(s) h_j(t)$$

for all $s,t$ in $I$. Just as the eigenvalues of a positive semidefinite
matrix are determined by the matrix, the $\lambda$'s and the "rank" $J$ are
determined by $H$. It can be shown that every function in the CS is
a linear combination of the $h_j$ no matter which complete set of statistics
is used to construct $H$. In other words, the $h_j$ are coordinate functions,
and $J$ is the dimension of the CS.
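The decomposition (*) can be approximated numerically by discretizing $H$ on a grid of ability values. The following is a minimal sketch, not the program used by the authors: the logistic item curves, the grid, the trapezoid weights `w`, the tolerance `rel_tol` and all function names are assumptions made for illustration. It uses the product form of $H$ for the elementary formula scores and returns eigenvalues together with functions that are orthonormal under the quadrature version of the inner product.

```python
import numpy as np

def logistic_icc(t, a, b, c):
    """Hypothetical three-parameter logistic ICC (illustrative assumption)."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (t - b)))

def trapezoid_weights(t):
    """Weights w such that sum(w * g(t)) approximates the integral of g over I."""
    w = np.empty_like(t)
    w[1:-1] = (t[2:] - t[:-2]) / 2.0
    w[0] = (t[1] - t[0]) / 2.0
    w[-1] = (t[-1] - t[-2]) / 2.0
    return w

def kernel_H(P):
    """H(s,t) = prod_i [1 + P_i(s) P_i(t)]; P has shape (n_items, n_grid)."""
    return np.prod(1.0 + P[:, :, None] * P[:, None, :], axis=0)

def coordinate_functions(H, w, rel_tol=1e-10):
    """Return (lam, h): eigenvalues in descending order and rows h[j] that are
    orthonormal under (g, h) = sum(w * g * h). Eigenvalues below
    rel_tol * lam[0] are treated as zero."""
    sw = np.sqrt(w)
    A = sw[:, None] * H * sw[None, :]      # symmetric discretization of the operator
    lam, vecs = np.linalg.eigh(A)          # eigenvalues in ascending order
    lam, vecs = lam[::-1], vecs[:, ::-1]   # reorder to descending
    keep = lam > rel_tol * lam[0]
    h = (vecs[:, keep] / sw[:, None]).T    # undo the sqrt(w) scaling
    return lam[keep], h

# Example with five made-up items on the ability interval I = [-3, 3].
t = np.linspace(-3.0, 3.0, 61)
w = trapezoid_weights(t)
a = np.array([1.0, 1.2, 0.8, 1.5, 1.1])
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
P = logistic_icc(t[None, :], a[:, None], b[:, None], 0.2)
H = kernel_H(P)
lam, h = coordinate_functions(H, w)
```

The number of retained eigenvalues, `len(lam)`, plays the role of a working dimension; it depends on the tolerance chosen, which is a judgment call in this sketch rather than something fixed by the theory.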

To construct an estimator of the coordinate $\alpha_j = E[h_j(\theta)]$, note
that $h_j$ is an eigenfunction of the "linear operator" defined by $H$.
In symbols,

$$\lambda_j h_j(t) = \int H(s,t)\, h_j(s)\, ds = \sum_k R_{S_k}(t) \int R_{S_k}(s)\, h_j(s)\, ds = \sum_k R_{S_k}(t)\, (R_{S_k}, h_j).$$

If $R_{S_k}(t) = E[S_k \mid \theta = t]$ is replaced by $S_k$ in this formula and the
result is divided by $\lambda_j$, a statistic $X_j$ is specified,

$$X_j = \lambda_j^{-1} \sum_k S_k\, (R_{S_k}, h_j).$$

Since $E(S_k \mid \theta = t)$ is $R_{S_k}(t)$, it follows that $E(X_j \mid \theta = t) = h_j(t)$
and $E(X_j) = E[h_j(\theta)]$.

Thus the sample mean $\bar{X}_j$ is a consistent, unbiased estimator of
the coordinate $\alpha_j$.

$J$, the dimensionality of the CS of the test, was calculated by
analyzing any complete set of statistics $\{S_k\}$. $J$ turned out to be
the "rank" of the linear operator

$$h \mapsto \Gamma(h) = \int \sum_k R_{S_k}(\cdot) R_{S_k}(t)\, h(t)\, dt = \int H(\cdot,t)\, h(t)\, dt .$$

Although the eigenfunctions $h_j$ and the eigenvalues $\lambda_j$ depend on the
choice of $\{S_k\}$, $J$ does not. However, in some applications we may be
led to particular $\{S_k\}$ and subsequently to a decision to treat the CS
as if it had dimension $J' < J$.

The typical situation arises when one considers the problem of selecting
$\hat{f}$ so as to minimize the quadratic index of goodness of fit

$$Q(f) = \sum_k [\bar{S}_k - E(S_k; f)]^2 .$$

Here $\bar{S}_k$ is the sample average value of $S_k$ and

$$E(S_k; f) = \int_I R_{S_k}(t)\, f(t)\, dt$$

is the predicted value of $\bar{S}_k$. If $H(s,t) = \sum_{j=1}^{J} \lambda_j h_j(s) h_j(t)$,
then $Q$ can be written in the form

$$(**)\qquad Q(f) = \sum_{j=1}^{J} \lambda_j [\alpha_j - \bar{X}_j]^2 + \text{terms that are independent of } f$$

where $\alpha_j = (f, h_j)$ and $\bar{X}_j = \lambda_j^{-1} \sum_k \bar{S}_k (R_{S_k}, h_j)$.
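Formula (**) can be checked directly. The following derivation is a sketch that uses the shorthand $c_{kj} = (R_{S_k}, h_j)$, a symbol introduced here for convenience, and relies only on the facts that each $R_{S_k}$ lies in the CS and that the $h_j$ are orthonormal eigenfunctions of $H$.

```latex
\begin{align*}
E(S_k; f) &= (R_{S_k}, f) = \sum_{j=1}^{J} c_{kj}\,\alpha_j ,
  \qquad c_{kj} = (R_{S_k}, h_j),\\
\sum_k c_{kj}\,c_{kj'} &= \iint H(s,t)\,h_j(s)\,h_{j'}(t)\,ds\,dt
  = \lambda_j\,\delta_{jj'},\\
Q(f) &= \sum_k \bar{S}_k^{\,2}
  - 2\sum_{j=1}^{J} \alpha_j \sum_k \bar{S}_k\, c_{kj}
  + \sum_{j=1}^{J} \lambda_j\,\alpha_j^{2}\\
 &= \sum_{j=1}^{J} \lambda_j\,(\alpha_j - \bar{X}_j)^2
  + \Bigl[\,\sum_k \bar{S}_k^{\,2} - \sum_{j=1}^{J} \lambda_j\,\bar{X}_j^{\,2}\Bigr],
\end{align*}
```

where the last step uses $\lambda_j \bar{X}_j = \sum_k \bar{S}_k c_{kj}$. The bracketed term does not involve $f$.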

From formula (**) it can be seen that the size of $\lambda_j$ measures
the degree of improvement of fit to a set of statistics $\{S_k\}$ that
can be obtained by including one more term in the representation of
the identifiable part of the ability density

$$\sum_{j' < j} \alpha_{j'} h_{j'}(\cdot).$$

If $\lambda_j$ is very small or $\bar{X}_j$ has very large sampling error, then
we generally do not attempt to estimate the coordinate and proceed
as if the canonical space had lower dimensionality than $J$.


In applications we have been selecting $J'$ by computing the
$\lambda_j$ and treating very small $\lambda_j$'s as zero. As a check on the adequacy
of this procedure we examine the difference between

$$\sum_{j=1}^{J'} (g, h_j)\, h_j(\cdot)$$

and $g(\cdot)$ for various functions $g$. The functions $g$ we generally
consider are the $P_i$, selected regression functions and various
guesses about $f(\cdot)$. The two functions will agree exactly if $g$ is in
the CS and $J'$ is the dimensionality of the CS.

This section is concluded with a brief discussion of a formula
scoring technique that has proven more powerful and accurate than all
of the other techniques we have tried. The complete formula score
method (CFSM) uses the elementary formula scores $\{v_k\}$ as
its complete set of statistics and begins with the identity

$$H(s,t) = \sum_k R_{v_k}(s)\, R_{v_k}(t) = \prod_{i=1}^{n} [1 + P_i(s)P_i(t)].$$

(The identity is verified by induction or by expanding the product.)

The sum has $2^n$ terms, but the product has only $n$ factors and thus
can be accurately calculated with many fewer operations.
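For a small number of items the identity can be confirmed by brute force. The sketch below is illustrative only; the function names and the random values standing in for $P_i(s)$ and $P_i(t)$ are assumptions. It sums over all $2^n$ products of item characteristic curves and compares the result with the $n$-factor product.

```python
import itertools
import numpy as np

def H_by_sum(Ps, Pt):
    """Sum over all 2^n subsets k of prod_{i in k} P_i(s) * prod_{i in k} P_i(t).

    Ps[i] and Pt[i] hold P_i(s) and P_i(t) for fixed ability levels s and t.
    The empty subset corresponds to the constant score 1.
    """
    n = len(Ps)
    total = 0.0
    for r in range(n + 1):
        for subset in itertools.combinations(range(n), r):
            idx = np.array(subset, dtype=int)
            total += np.prod(Ps[idx]) * np.prod(Pt[idx])
    return total

def H_by_product(Ps, Pt):
    """The n-factor form prod_i [1 + P_i(s) P_i(t)]."""
    return np.prod(1.0 + Ps * Pt)

rng = np.random.default_rng(0)
Ps = rng.uniform(0.1, 0.9, size=6)   # stand-ins for P_i(s), six items
Pt = rng.uniform(0.1, 0.9, size=6)   # stand-ins for P_i(t)
agree = np.isclose(H_by_sum(Ps, Pt), H_by_product(Ps, Pt))
```

The brute-force version visits 64 subsets for six items and doubles with every added item, which is the cost the product form avoids.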

The $X_j$ can also be calculated with a variant of this identity.
Each sampled examinee's data is transformed to define a random
continuous function $V(t)$, which is called the V-transform of the
examinee's data. This function

$$V(t) = \prod_{i=1}^{n} [1 + u_i P_i(t)]$$

is easily calculated and is identically equal to

$$\sum_{k=1}^{2^n} v_k R_{v_k}(t).$$

Therefore $\int V(t) h_j(t)\, dt$ equals $\lambda_j X_j$. This reduces
the calculation of $X_j$ from $2^n$ operations to one numerical integration.

In fact, to calculate the sample mean $\bar{X}_j$ only one integration need be
done. This is true because the sample average of the integrated
$V(t)$ is the integral of the sample average:

$$\text{Ave}\Bigl\{ \int V(t) h_j(t)\, dt \Bigr\} = \int [\text{Ave}\, V(t)]\, h_j(t)\, dt.$$

Thus in applications, $V(t)$ is computed on a grid of $t$ values for
each examinee, accumulated over examinees and numerically integrated to
obtain the $J$ sample means $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_J$. This procedure can be adapted
to compute sample covariances of the $X_j$ by numerical integration.
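The accumulate-then-integrate procedure can be sketched as follows. This is a hedged illustration, not the original software: the array layout, the quadrature weights `w`, and the function names are assumptions, and the coordinate functions `h` and eigenvalues `lam` are taken as already computed on the same grid.

```python
import numpy as np

def v_transform(U, P):
    """V(t) = prod_i [1 + u_i P_i(t)] for every examinee.

    U: (N, n) array of 0/1 item scores; P: (n, M) array of ICC values on a
    grid of M ability levels. Returns an (N, M) array, one row per examinee.
    """
    return np.prod(1.0 + U[:, :, None] * P[None, :, :], axis=1)

def coordinate_scores(U, P, h, lam, w):
    """X_j for each examinee: (1 / lam_j) * integral of V(t) h_j(t) dt.

    h: (J, M) coordinate functions on the grid; lam: (J,) eigenvalues;
    w: (M,) quadrature weights. Returns an (N, J) array.
    """
    return (v_transform(U, P) * w) @ h.T / lam

def coordinate_means(U, P, h, lam, w):
    """Sample means of the X_j, averaging V over examinees first so that
    only one numerical integration per coordinate is needed."""
    V_bar = v_transform(U, P).mean(axis=0)
    return h @ (w * V_bar) / lam
```

For large samples one would likely accumulate `V_bar` in a running sum rather than hold all $N$ rows in memory; the version above favors brevity.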


Continuation  of  Notes  on  Section  Four 


A  Generalization  of  CFSM 


Components of CFSM:

$$V(\cdot) = \prod_{i=1}^{n} [1 + u_i P_i(\cdot)]$$

$$H(s,t) = E[V(t) \mid \theta = s]$$

Analogues for LFSM:

$$R_{w_i}(\cdot) = 2P_i(\cdot) - 1$$

$$L(\cdot) = \prod_{i=1}^{n} [1 + w_i R_{w_i}(\cdot)]$$

$$G(s,t) = \prod_{i=1}^{n} [1 + R_{w_i}(s) R_{w_i}(t)].$$

Note that the likelihood function is proportional to $L$.

Thus in applications requiring a representation of likelihood
functions as linear combinations of a small number of coordinate
functions, the eigenfunctions of $G$ (or a modification of $G$
used to avoid numerical problems) are now commonly used.
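The proportionality between $L$ and the likelihood can be seen factor by factor: with $w_i = 2u_i - 1$, the factor $1 + w_i R_{w_i}(t)$ equals $2P_i(t)$ when $u_i = 1$ and $2[1 - P_i(t)]$ when $u_i = 0$, so the constant is $2^n$. A small numerical check, with made-up curve values and hypothetical function names:

```python
import numpy as np

def likelihood(u, P):
    """prod_i P_i(t)^{u_i} [1 - P_i(t)]^{1 - u_i} on a grid.

    u: (n,) array of 0/1 item scores; P: (n, M) array of ICC values.
    """
    return np.prod(np.where(u[:, None] == 1, P, 1.0 - P), axis=0)

def L_transform(u, P):
    """L(t) = prod_i [1 + w_i R_{w_i}(t)] with w_i = 2 u_i - 1, R_{w_i} = 2 P_i - 1."""
    w = 2 * u - 1
    R = 2.0 * P - 1.0
    return np.prod(1.0 + w[:, None] * R, axis=0)

# Two items evaluated at three ability levels (values chosen arbitrarily).
P = np.array([[0.2, 0.5, 0.9],
              [0.6, 0.3, 0.7]])
u = np.array([1, 0])
ratio = L_transform(u, P) / likelihood(u, P)   # constant 2**n = 4 at every grid point
```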




Section  Five:  Discovering  the  Shapes  of 
Item  Response  Functions 


The "density ratio" estimation procedure on the
following pages was improved by implementing the
first two refinements at the end of the section.
In the process of implementing the third, an
alternative strategy was discovered. The new
strategy has the generality of density ratio
estimation, but it makes much more efficient use
of data. The key idea is sketched at the end of
this section.




Sometimes, but not always, the shape of item characteristic
curves can be rationally deduced and parameterized. For example, the
S-shape of the three-parameter logistic curve on an unbounded ability
continuum follows from: (1) Monotonicity (more able examinees are
more likely to answer correctly). (2) Asymptotes (although the
probability of a correct response can be made arbitrarily close to
one by sampling from very high ability subpopulations, a substantial
proportion of each low-ability subpopulation will select the correct option of
a well-constructed multiple choice item). (3) Simplicity/Parsimony (the
item response curve has no more points of inflection than the one implied
by (1), (2) and smoothness conditions). (4) Symmetry (the graph of the item
characteristic curve is symmetric for high and low ability levels in the sense
that a length-preserving transformation $(x,y) \mapsto (2x_0 - x,\, 2y_0 - y)$ about a
point of inflection $(x_0, y_0)$ on the graph carries the curve into itself).

Fitting  three  parameter  logistic  functions  is  sensible  when 
these  conditions  are  met  because  every  curve  satisfying  these 
conditions  will  be  close  to  at  least  one  three-parameter  logistic 
function. 

Sometimes these assumptions are implausible or clearly false, and
a method is needed to discover and parameterize shape. As a
one-dimensional example, it was demonstrated (Levine and Drasgow, 1983) with
a very large sample of aptitude test examinees that the conditional
response function for the response of choosing a particular (wrong)
option on a multiple choice test for several items has a bow shape.
[Inline sketch of the bow-shaped curve]




In multidimensional measurement, the shape of the item characteristic surface
is a matter of considerable psychological importance because it represents a
statement about how several abilities interact to simultaneously determine
response probability. In the next section, the methods discussed in this section
are used to develop a procedure for determining the shape of an item
characteristic surface.

Another application of the method described in this section
is the study of item responses, such as omitting, that cannot reasonably
be expected to satisfy local independence. FST permits one to construct a
consistent estimate of an "omitting characteristic curve"

$$\text{Prob}\{\text{item } i \text{ is omitted} \mid \theta = t\}$$

without assuming local independence for omitting responses.

The  basic  issues  addressed  in  this  segment  of  our  research  are 

1.  What  is  the  shape  of  a  new  item  or  item  type's  item 
characteristic  curve? 

2.  What  information  about  ability  is  contained  in  wrong  answers 
and  item  omitting? 

3. How can item responses that may not be locally independent be
modeled? Examples include omitting and the following responses drawn from
computer administered tests:

(i) attempting to change an answer

(ii) requesting a display of a previously presented
item or part of an item

(iii) responding in a time clearly too short to read the
item


Equations Relating FST to ICC Shape: Our results in this area depend on the
following elementary relation, which is used to reduce item response function
estimation to ability density estimation:

$$\text{Prob}\{\text{correct response} \mid \text{ability is in } A\} = \frac{\text{Prob}\{\text{ability is in } A \mid \text{correct response}\}}{\text{Prob}\{\text{ability is in } A\}} \times \text{Prob}\{\text{correct response}\}.$$

Thus the conditional response probability is proportional to the
ratio of the ability distribution in the item-passing subpopulation
to the unconditional distribution. (The constant of proportionality,
$\text{Prob}\{\text{correct response}\} = \bar{P}$, has an obvious consistent unbiased
estimate, the sample proportion correct.) If regularity assumptions
are made then

$$P(t) = \frac{f^{+}(t)}{f(t)}\, \bar{P}$$

where $f^{+}$ = the ability density in the subpopulation of item-passers and
$f$ = the ability density.


This equation can be written in the form

$$P(t) = \frac{1}{1 + \dfrac{\bar{Q}\, f^{-}(t)}{\bar{P}\, f^{+}(t)}}$$

where $\bar{Q} = 1 - \bar{P}$ and $f^{-}$ is the failure density.
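The two density-ratio forms are linked by the mixture identity for the passing and failing subpopulations. A short sketch of the algebra, writing $\bar{P}$ for Prob{correct response} and $\bar{Q} = 1 - \bar{P}$ (notation assumed here), valid wherever $f^{+}(t) > 0$:

```latex
\begin{align*}
f(t) &= \bar{P}\,f^{+}(t) + \bar{Q}\,f^{-}(t),\\
P(t) &= \frac{\bar{P}\,f^{+}(t)}{f(t)}
      = \frac{\bar{P}\,f^{+}(t)}{\bar{P}\,f^{+}(t) + \bar{Q}\,f^{-}(t)}
      = \frac{1}{1 + \dfrac{\bar{Q}\,f^{-}(t)}{\bar{P}\,f^{+}(t)}} .
\end{align*}
```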

This formula is especially useful for developing a distribution
theory for the estimates because the failure and passing subpopulations
are disjoint. Thus the distribution of an estimator of $f^{+}(t)/f^{-}(t)$
can be developed by studying the ratio of statistically independent
random variables.

Several  variations  of  this  approach  to  item  characteristic  curve 
estimation  have  been  tried  with  generally  satisfactory  but  occasionally 
poor  results.  Systematic  comparisons  of  the  variations  will  be  made  after 
more  exploratory  work.  Some  current  and  projected  refinements  are  sketched 
below. 


(1) If ability is uniformly distributed or if $f(\cdot)$ is constant
on a range of abilities of interest, then the ICC is proportional
to $f^{+}$ and ICC estimation is simplified. Furthermore, sampling
fluctuation is unlikely to give a zero or negative density
estimate. An initial approximation can be used to transform
ability so that $f(\cdot)$ is approximately constant.

(2) The sampling distribution of coordinate estimates depends
on the choice of $\{S_k\}$. We have observed that if $h_1$,
the coordinate function with the largest eigenvalue, is
close to $f(\cdot)$ then very good estimates of $f(\cdot)$ are
obtained. Currently an attempt is being made to capitalize
on this effect by carefully choosing $\{S_k\}$ and controlling
the eigenfunction shapes.

(3) The current density estimates are least squares in the sense
that they minimize the residual sum of squares $Q$ (Section 4).
The results on density equivalence permit the expression of
the likelihood function as a function of finitely many parameters,
the coordinates $\alpha_j$. In principle, one could maximize this
expression and compute maximum likelihood density estimates.




The  basic  equation  used  to  relate  item  characteristic  curves 
to  density  ratios  is  essentially  the  definition  of  conditional  probability. 
Local  independence  plays  no  role  in  its  derivation.  In  fact  the  binary 
item  response  on  the  focal  item  is  being  used  merely  to  divide  the  sample 
into  item  passers  and  item  failers.  The  equation  would  remain  valid  if 
any  binary  score  were  used  to  dichotomize  the  sample  and  population. 

Therefore  the  same  estimation  techniques  used  to  estimate  item  characteristic 
curves  could  be  used  to  study  the  relation  of  ability  to  complex  item  responses 
that  failed  to  satisfy  the  local  independence  assumption.  These  include  item 
skipping  and  very  fast  responding. 

Continuation of Notes on Section Five

Maximum Likelihood FST Item Response Curve Estimation

If

$$P(t) = \sum_{j=1}^{J} \alpha_j^{+} h_j(t)$$

is the item response function for item score $u$,

$$\ell(U_a^*; t) = \text{Prob}\{U = U_a^* \mid \theta = t\}$$

is the likelihood function for examinee $a$'s response pattern
$U_a^*$ on calibrated items, and $f$ is the ability density, then

$$\text{Prob}\{U = U_a^* \;\&\; u = u_a^*\} = \int \{u_a^* P(t) + (1 - u_a^*)[1 - P(t)]\}\, \ell(U_a^*; t)\, f(t)\, dt$$

$$= \int \Bigl\{(2u_a^* - 1) \sum_j \alpha_j^{+} h_j(t) + (1 - u_a^*)\Bigr\}\, \ell(U_a^*; t)\, f(t)\, dt$$

$$= (1 - u_a^*) \int \ell(U_a^*; t)\, f(t)\, dt + \sum_j \alpha_j^{+} (2u_a^* - 1) \int h_j(t)\, \ell(U_a^*; t)\, f(t)\, dt$$

is a linear function of the unknown $\alpha^{+}$. Thus the log likelihood
function $\sum_a \log \text{Prob}\{U = U_a^* \;\&\; u = u_a^*\}$ is concave in $\alpha^{+}$
(equivalently, its negative is convex).

Furthermore, the set of vectors $\{\alpha^{+} : 0 \le \sum_j \alpha_j^{+} h_j(t) \le 1 \text{ for all } t \text{ in } I\}$ is convex.

Thus maximum likelihood coordinate estimates can be computed by
solving a standard mathematical problem: maximize a concave function
on a convex set.

Maximum likelihood is now being used to estimate item
response functions and ability distributions.




Section  Six:  Multidimensional  Formula  Scoring 
for  Homogeneous  Subtests 


The key idea in this section is the representation
of the canonical space of two tests as the
tensor product of the canonical spaces for each
test. With this representation the application
of one dimensional results to estimate multivariate
distributions is straightforward. In
particular, previously discussed "density ratios"
or maximum likelihood coordinate estimates can be
used to estimate bivariate item response surfaces.
It should have been emphasized that the multivariate
ability distributions are intrinsically
important.


There is an interesting and important multidimensional measurement
problem that can be attacked with currently available, unidimensional
software. The problem is to discover how several abilities jointly determine
response probability on new item types. This can be done when certain
psychological assumptions are valid. In particular, it is necessary that
a variety of item types are available, that all items depend on the same small
number of abilities and that items can be grouped into "homogeneous" subtests.

For concreteness consider three item types: synonyms, antonyms and
analogies. Suppose that all three depend only on a pair of abilities $\theta_1$,
$\theta_2$ in the sense that for any ability levels $s$, $t$ and any $r$ items, the item
scores $u_{i_1}, u_{i_2}, \ldots, u_{i_r}$ satisfy

$$E\Bigl[\prod_{j=1}^{r} u_{i_j} \Bigm| \theta_1 = s \;\&\; \theta_2 = t\Bigr] = \prod_{j=1}^{r} E[u_{i_j} \mid \theta_1 = s \;\&\; \theta_2 = t].$$

In other words the items are independent in subpopulations formed by
conditioning on both abilities.

Homogeneous subtests, defined more formally below, are unidimensional
subtests of a multidimensional test. For example both synonym and antonym
items may require both language fluency $\theta_1$ and an ability to recognize
and generalize abstract relations $\theta_2$. But the antonym items can be
written in such a way as to demand a relatively large amount of the second
ability. Thus synonym items may satisfy local independence with respect
to some linear or nonlinear function of $\theta_1$ and $\theta_2$, say $\theta_1 + \theta_2$,
and antonym items with respect to, say, $\theta_1 + 2\theta_2$. Subtests consisting
of one item type only will appear unidimensional, but the total test will not.

Note that the assumption that item types form homogeneous subtests
is more general (and more believable) than the assumption that different
item types measure different traits. This should be obvious after the
discussion of "ad hoc coordinates."

If certain plausible assumptions (written out below) are correct then
off-the-shelf unidimensional parameter estimation programs can be used with
a test consisting of only synonym items and a test consisting of only antonym
items. FST can then be used to represent and estimate bivariate analogy item
response functions. After some preliminary definitions, additional details are
given.

Homogeneous subtests and ad hoc coordinates:

Latent trait theory provides a way to precisely state what is meant
by a homogeneous subtest and items requiring different amounts of unobserved
abilities. As before the population of examinees is denoted by a point
set $\Omega = \{\omega\}$. In this situation, abilities map examinees into 2-vectors
rather than numbers: $\theta(\omega) = \langle \theta_1(\omega), \theta_2(\omega) \rangle$. Because examinees are randomly
sampled, $\theta$ is a random vector.

To quantify the notion of homogeneous subtests, a pair of number
valued functions $\phi$, $\psi$ are considered. The first subtest is homogeneous in
the sense that it satisfies a local independence assumption relative to $\phi(\theta)$.
In symbols, for items $i_1, i_2, \ldots, i_r$ on the first subtest

$$\text{Prob}\{u_{i_1} = 1, u_{i_2} = 1, \ldots \text{ and } u_{i_r} = 1 \mid \phi(\theta) = t\} = \prod_{j=1}^{r} P_{i_j}(t)$$

where $P_{i_j}(t) = \text{Prob}\{u_{i_j} = 1 \mid \phi(\theta) = t\}$.

A similar equation expresses the assumption that the second subtest is
homogeneous with respect to $\psi(\theta)$: The item scores for items in the second
subtest are independent in the subpopulation of examinees with the property
$\psi(\theta) = s$ for each constant $s$.

Note that these conditions permit using available parameter estimation
programs to calibrate each subtest separately. Somewhat paradoxically, each
item is essentially multidimensional but each subtest satisfies the axioms
of one dimensional latent trait theory. FST provides a method for integrating
the subtests and modelling the analogy items that also depend on fluency and
abstraction, but to an unknown extent.

The functions $\phi$ and $\psi$ are called "ad hoc coordinates." If for each
$\langle s,t \rangle$ there is at most one vector $\langle x,y \rangle$ satisfying

$$s = \phi(x,y)$$
$$t = \psi(x,y)$$

and certain regularity conditions are met then $\phi$ and $\psi$ can be treated as
curvilinear coordinates for the set of (bivariate) abilities.

The FST analysis to be presented only gives a representation of item
response functions in terms of ad hoc coordinates

$$P_i(s,t) = \text{Prob}\{u_i = 1 \mid \phi(\theta) = s \;\&\; \psi(\theta) = t\}.$$

For many modelling problems this is adequate. Admittedly the variables $\theta_1$ (fluency)
and $\theta_2$ (abstraction) are considerably more interesting than $\phi$ and $\psi$.

Conjoint measurement or uniform systems analysis (Levine, 1970) may
permit one to analyze the relation between $\phi$, $\psi$ and $\theta_1$, $\theta_2$. However, our
current concern is with the psychometric problem of representing new
items in the ad hoc coordinate system.

Formula  Score  Approach  to  Representing  New  Item  Types 

Consider a twenty-one item test consisting of 10 synonym items followed
by 10 antonym items and one analogy item. Our task is to compute

$$\text{Prob}\{u_{21} = 1 \mid \phi(\theta) = s \;\&\; \psi(\theta) = t\} = P(s,t).$$

We propose to first analyze the homogeneous subtests separately.

The most direct approach would be to first calibrate the synonym items by
embedding them in a large conventional administration of many items of the
same type. The analysis would yield $\phi$ item characteristic curves

$$\text{Prob}\{u_i = 1 \mid \phi(\theta_1,\theta_2) = t\} = P_i(t), \quad i = 1, 2, \ldots, 10.$$

Similarly a separate analysis of the antonym items yields $\psi$ item
characteristic curves

$$\text{Prob}\{u_i = 1 \mid \psi(\theta_1,\theta_2) = t\} = P_i(t), \quad i = 11, 12, \ldots, 20.$$

To discover the shape of an analogy item's ICC, a 21 item test
would be administered to a sample of examinees. By an obvious generalization
of the result in Section 5, the analogy item ICC

$$P(s,t) = \text{Prob}\{u_{21} = 1 \mid \phi(\theta_1,\theta_2) = s \text{ and } \psi(\theta_1,\theta_2) = t\}$$

can be represented as a ratio of densities

$$P(s,t) = \frac{f^{+}(s,t)}{f(s,t)}\, \bar{P}$$

where $f^{+}$ is the (bivariate) density in the subpopulation of item 21 passers,
$f$ is the unconditional density, and $\bar{P}$ is the probability of passing item 21.

Before  continuing  this  analysis,  the  CS  for  the  20  item  test  is  described. 
It  turns  out  that  the  assumptions  imply  that  the  canonical  space  of  the  twenty 
item  test  is  simply  the  "tensor  product"  of  the  CS  for  the  first  subtest 
and  the  second  subtest. 

At  this  point  it  seems  advisable  to  restate  some  definitions.  The 
CS  for  the  first  subtest  is  the  set  of  one-dimensional  regression  functions 


E[S|0(ere2)  =  t]  =  R$(t) 

where  S  is  a  statistic  whose  value  depends  on  the  first  ten  item 
scores  only.  The  CS  for  the  antonym  item  is  the  set  of  regression 
functions  ECS ( ^ ( »©2 )  *  t]  for  statistics  S  that  are  functions 
of  the  second  ten  scores  only. 

The  CS  for  the  first  twenty  item  subtest  will  be  the  set  of  regression  functions 


$$R_S(s,t) = E(S \mid \phi(\theta_1,\theta_2) = s \text{ and } \psi(\theta_1,\theta_2) = t)$$

for the statistics S of the first 20 scores. Our psychometric assumptions
imply that any $R(s,t)$ in the 20 item CS can be written as a finite sum
of functions of the form $h(s)h'(t)$ for $h$ in the first subtest
CS and $h'$ in the second subtest CS. In fact if $\{h_1, h_2, \ldots, h_J\}$
is a basis for the first CS and $\{h'_1, h'_2, \ldots, h'_{J'}\}$ for the second
CS then the $J \times J'$ functions

$$h_{jj'}(s,t) = h_j(s)\,h'_{j'}(t), \qquad j = 1, 2, \ldots, J, \quad j' = 1, 2, \ldots, J'$$

will be a basis for the two-dimensional CS.
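
The product construction can be sketched in a few lines of Python. This is an illustration only: the monomial bases `first_basis` and `second_basis` are hypothetical stand-ins rather than canonical-space bases produced by an FS program, and the function name `product_basis` is introduced here for convenience.

```python
from itertools import product


def product_basis(first_basis, second_basis):
    """Return the J x J' functions h_jj'(s, t) = h_j(s) * h'_j'(t).

    Keys are (j, j') pairs numbered from 1, as in the text.
    """
    basis = {}
    for (j, h), (jp, hp) in product(enumerate(first_basis, start=1),
                                    enumerate(second_basis, start=1)):
        # Bind h and hp as default arguments so that each closure keeps
        # its own pair of one-dimensional functions.
        basis[(j, jp)] = lambda s, t, h=h, hp=hp: h(s) * hp(t)
    return basis


# Hypothetical one-dimensional bases (J = 3, J' = 2), for illustration only.
first_basis = [lambda s: 1.0, lambda s: s, lambda s: s * s]
second_basis = [lambda t: 1.0, lambda t: t]

two_dim_basis = product_basis(first_basis, second_basis)
```

With J = 3 and J' = 2 the dictionary holds six functions, one for each pair (j, j').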

Current  one  dimensional  FS  programs  can  be  used  to  estimate  the  bivariate 
densities  f+  and  f.  The  estimate  will  have  form 

$$f^{+}(s,t) = \sum_{j=1}^{J} \sum_{j'=1}^{J'} \bar{X}_{jj'}\, h_{jj'}(s,t)$$

where the sample mean $\bar{X}_{jj'}$ is a consistent, unbiased estimator of
$E[h_{jj'}(\theta)]$. If $f^{+}$ and $f$ are in the 20 item CS, then (by sample
splitting  to  estimate  f+  and  f  separately)  a  consistent  estimate 
of  P(s,t)  is  easily  specified,  and  the  shape  of  the  item  response 
surface  can  be  "discovered." 
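
A minimal sketch of how such an estimate might be evaluated is given below. It rests on several assumptions that are not in the text: the coordinate functions are taken to be shifted Legendre polynomials, orthonormal on [0, 1], as a stand-in for a real canonical-space basis; the sample means are supplied directly as the nested lists `xbar_pass` and `xbar_all` (indexed from 0) rather than computed from formula scores; and `pass_rate` is the overall proportion passing the item, the constant Bayes' rule places in front of the density ratio. All function names are hypothetical.

```python
import math


def legendre01(j, x):
    """Shifted Legendre polynomials, orthonormal on [0, 1] (j = 0, 1, 2)."""
    if j == 0:
        return 1.0
    if j == 1:
        return math.sqrt(3.0) * (2.0 * x - 1.0)
    if j == 2:
        return math.sqrt(5.0) * (6.0 * x * x - 6.0 * x + 1.0)
    raise ValueError("only j = 0, 1, 2 are implemented in this sketch")


def density_estimate(xbar, s, t):
    """Evaluate sum_j sum_j' xbar[j][j'] * h_j(s) * h_j'(t)."""
    return sum(xbar[j][jp] * legendre01(j, s) * legendre01(jp, t)
               for j in range(len(xbar))
               for jp in range(len(xbar[j])))


def icc_estimate(xbar_pass, xbar_all, pass_rate, s, t):
    """Bayes-rule ratio pass_rate * f_plus(s, t) / f(s, t)."""
    f_all = density_estimate(xbar_all, s, t)
    if f_all <= 0.0:
        raise ValueError("estimated unconditional density is not positive")
    return pass_rate * density_estimate(xbar_pass, s, t) / f_all


# Made-up coefficients: f is uniform on the unit square and f_plus(s, t) = 2s,
# so with a pass rate of 0.5 the implied response surface is P(s, t) = s.
xbar_all = [[1.0, 0.0], [0.0, 0.0]]
xbar_pass = [[1.0, 0.0], [1.0 / math.sqrt(3.0), 0.0]]
p_hat = icc_estimate(xbar_pass, xbar_all, 0.5, 0.25, 0.9)
```

The coefficients in the example are exact expectations chosen by hand, so `p_hat` recovers the surface value at s = 0.25 up to rounding. With real data the two sets of means would come from separate subsamples, and the ratio is reliable only where the estimated unconditional density is well away from zero.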

The  success  of  this  approach  requires  f+  and  f  to  be  in  or 
near  the  20  item,  two-dimensional  CS.  This  assumption  seems 
plausible  when  one  considers  the  variety  of  shapes  that  can  be 
constructed  as  linear  combinations  of  the  J  x  J'  coordinate  functions. 




Section  Seven:  Are  Two  Tests  Measuring 
the  Same  Trait(s)? 


A colleague (Fritz Drasgow) has convinced me that
I am asking the wrong question in this section.
Different tests generally measure different traits;
the important question is, "How different are the
traits?" If two abilities measured by a pair of
tests are equal, then the probability of sampling
an examinee with $\theta_1 \neq \theta_2$ will be zero. Therefore
we have been attempting to quantify the difference
between abilities by using the results in Section
Six to estimate bivariate distributions and the
expectations of functions of bivariate abilities
such as $E[(\theta_1 - \theta_2)^2]$.

Suppose a major revision is made of a complex, not necessarily
unidimensional  test.  Does  the  new  test  measure  the  same  traits?  Suppose 
the  format  or  mode  of  administration  is  changed.  Does  the  test  still 
measure  the  same  traits?  Suppose  a  translation  of  the  test  is  attempted 
into  another  language  and  that  it  is  unlikely  that  every  original  test 
item  is  psychometrically  equivalent  to  its  translation.  Can  the  translated 
test  nonetheless  measure  the  same  traits  as  the  original?  These  questions 
have  led  us  to  ask  the  foundations  of  measurement  question, 

What  necessary  and/or  sufficient  conditions  must  item  scores 
obey  before  it  can  be  concluded  that  two  nonparallel  tests 
are  measuring  the  same  trait(s)? 

It is hoped that theoretical work on this problem will lead to a statistical
test that can be used in applications. This section relates FST to the problem
and reports some current work.

Two  nonparallel  tests  are  administered  to  the  same  population.  (The  tests 
can  be  thought  of  as  subtests  of  one  test.)  Item  response  curves  are 
fitted to each test separately. In other words, except possibly to compute
an equating transformation, only test-one scores are used to estimate
the IRF of a test-one item. Can the variable in the first set of IRFs
be given the same interpretation as the variable in the second set of IRFs?

The mathematical kernel of this problem seems to be this. A probability
measure is given for a set $\Omega$ along with two sets of zero-one random variables

$$u_1, u_2, \ldots, u_n \qquad \text{and} \qquad u'_1, u'_2, \ldots, u'_{n'}.$$

Two sets of real functions

$$P_1, P_2, \ldots, P_n \qquad \text{and} \qquad P'_1, P'_2, \ldots, P'_{n'}$$

are also given. What conditions must be assumed in order for it to be
possible to construct one more random variable $\theta$ such that the P's are item
response functions for the u's relative to $\theta$ and all $n + n'$ u's are locally
independent relative to $\theta$?

A strong necessary condition on the given scores and functions can
be formulated with FS notation and concepts. Let $\{v_k\}$ denote the elementary
scores in the first set of u's (Section 4). Let $\{v'_{k'}\}$ denote the
elementary scores in the second set of u's. Let $R_{v_k}$ and $R_{v'_{k'}}$ be the
corresponding products of P's. Thus if

$$v_k = u_{i_1} u_{i_2} \cdots u_{i_r}$$

then

$$R_{v_k}(\cdot) = P_{i_1}(\cdot)\, P_{i_2}(\cdot) \cdots P_{i_r}(\cdot).$$


Let $\{v_{k,k'}\}$ denote the elementary formula scores for the $n + n'$ item test, where
$v_{k,k'}$ is $v_k v'_{k'}$. Let $R_{k,k'}$ denote the corresponding product of functions
for $v_{k,k'}$, i.e., $R_{k,k'} = R_{v_k} R_{v'_{k'}}$. And let $\{h_j\}_{j=1}^{J}$ be an orthonormal basis
for the linear span of the $\{R_{k,k'}\}$. For example, the $h_j$ could be obtained
by analyzing the function $H(s,t)$ defined by

$$H(s,t) = \prod_{i=1}^{n} [1 + P_i(s) P_i(t)] \prod_{i=1}^{n'} [1 + P'_i(s) P'_i(t)]$$

as in Section 4. Finally, let $X_j$ be the random variable


which  can  be  more  conveniently  computed  as  described  in  Section  4. 
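
One reason the kernel H is a natural object to analyze is that expanding the product of factors of the form 1 + P(s)P(t) gives one term R(s)R(t) for each subset of items, where R is the product of the item response functions in that subset. The sketch below checks this identity numerically. It assumes that the empty subset (the constant function 1) counts among the elementary products, and it uses two-parameter logistic curves with made-up parameters purely as placeholder IRFs; the function names are hypothetical.

```python
import math
from itertools import combinations


def logistic_irf(a, b):
    """Hypothetical two-parameter logistic item response function."""
    return lambda t: 1.0 / (1.0 + math.exp(-a * (t - b)))


def elementary_products(irfs):
    """One product function R per item subset (the empty subset gives 1)."""
    products = []
    for r in range(len(irfs) + 1):
        for subset in combinations(irfs, r):
            # Bind subset as a default so each closure keeps its own items.
            products.append(
                lambda t, subset=subset: math.prod(p(t) for p in subset))
    return products


def kernel_H(irfs, s, t):
    """H(s, t) = product over items of [1 + P_i(s) * P_i(t)]."""
    return math.prod(1.0 + p(s) * p(t) for p in irfs)


# Illustrative items: two in the first test, one in the second.
first_test = [logistic_irf(1.0, -0.5), logistic_irf(1.5, 0.3)]
second_test = [logistic_irf(0.8, 0.0)]
all_items = first_test + second_test

products = elementary_products(all_items)
s, t = 0.4, -1.1
expanded = sum(R(s) * R(t) for R in products)
```

Here `expanded` and `kernel_H(all_items, s, t)` agree up to rounding. Each subset of the combined item pool splits into a subset of the first test and a subset of the second, which matches the pairing of $R_{v_k}$ with $R_{v'_{k'}}$ in the text.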

It is easy to show that the following condition on expected values of scores
is necessary:

$$(1) \qquad E(v_k v'_{k'}) = \sum_{j=1}^{J} E(X_j)\,(h_j, R_{k,k'})$$

for all elementary formula scores $v_k$ and $v'_{k'}$.

The condition also appears to be sufficient, although a proof is not
on hand at this time. In any event, additional work is needed to determine
when one random variable suffices for $\Omega$ and specified subpopulations of $\Omega$.